AITopics | similarity search

similarity search

Practical Near Neighbor Search via Group Testing

Neural Information Processing Systems | Apr-25-2026, 22:18:03 GMT

We present a new algorithm for the approximate near neighbor problem that combines classical ideas from group testing with locality-sensitive hashing (LSH). We reduce the near neighbor search problem to a group testing problem by designating neighbors as "positives," non-neighbors as "negatives," and approximate membership queries as group tests.
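The reduction described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`build_groups`, `group_test`, `query_neighbors`) are hypothetical, and each group test here is an exact distance check standing in for the approximate membership query the abstract mentions. Points are randomly assigned to groups several times, and a point stays a candidate only if every group it fell into tested positive.

```python
import random

def build_groups(n_points, n_groups, n_reps, seed=0):
    """Randomly partition point indices into n_groups groups, n_reps times."""
    rng = random.Random(seed)
    reps = []
    for _ in range(n_reps):
        groups = [[] for _ in range(n_groups)]
        for i in range(n_points):
            groups[rng.randrange(n_groups)].append(i)
        reps.append(groups)
    return reps

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def group_test(points, group, query, radius):
    """Assumption: an exact check used as a stand-in for an approximate
    membership query. True if the group holds any neighbor of the query."""
    return any(sq_dist(points[i], query) <= radius ** 2 for i in group)

def query_neighbors(points, reps, query, radius):
    """Keep only points whose group tested positive in every repetition."""
    candidates = set(range(len(points)))
    for groups in reps:
        positive = set()
        for g in groups:
            if group_test(points, g, query, radius):
                positive.update(g)
        candidates &= positive
    return candidates

# Toy data: points 0 and 1 are the "positives" for this query.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (9.0, 1.0), (3.0, 7.0), (8.0, 8.0)]
reps = build_groups(len(points), n_groups=3, n_reps=4)
query, radius = (0.0, 0.1), 0.5
candidates = query_neighbors(points, reps, query, radius)
```

Because the stand-in test is exact, true neighbors are never eliminated; a non-neighbor survives only if it shares a group with a neighbor in every repetition. A practical system would answer each group test from a compact per-group summary rather than by scanning the group.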

artificial intelligence, data mining, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States > Texas (0.14)

Industry: Information Technology (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

MUVERA: Multi-Vector Retrieval via Fixed Dimensional Encoding

Neural Information Processing Systems | Mar-22-2026, 04:40:41 GMT

Neural embedding models have become a fundamental component of modern information retrieval (IR) pipelines. These models produce a single embedding $x \in \mathbb{R}^d$ per data point, allowing for fast retrieval via highly optimized maximum inner product search (MIPS) algorithms. Recently, beginning with the landmark ColBERT paper, multi-vector models, which produce a set of embeddings per data point, have achieved markedly superior performance on IR tasks. Unfortunately, using these models for IR is computationally expensive due to the increased complexity of multi-vector retrieval and scoring. In this paper, we introduce MUVERA (MUlti-VEctor Retrieval Algorithm), a retrieval mechanism that reduces multi-vector similarity search to single-vector similarity search.
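To make the cost difference concrete, the following sketch contrasts brute-force single-vector MIPS with a multi-vector score. The multi-vector rule used here (sum over query vectors of the maximum inner product with any document vector) is the scoring commonly associated with ColBERT-style late interaction; it is an assumption for illustration and is not MUVERA's encoding, which the abstract does not describe. All function names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mips(query, docs):
    """Brute-force MIPS: index of the document with the largest inner product."""
    return max(range(len(docs)), key=lambda i: dot(query, docs[i]))

def max_sim(query_vecs, doc_vecs):
    """Assumed multi-vector score: for each query vector take its best match
    among the document's vectors, then sum those maxima."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

def multi_vector_search(query_vecs, docs):
    """Brute-force multi-vector retrieval: every document is a list of vectors."""
    return max(range(len(docs)), key=lambda i: max_sim(query_vecs, docs[i]))

# Single-vector case: one embedding per document.
single_docs = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
best_single = mips([1.0, 3.0], single_docs)

# Multi-vector case: a set of embeddings per document.
multi_docs = [
    [[1.0, 0.0], [1.0, 0.0]],   # covers only the first query vector
    [[0.9, 0.0], [0.0, 0.9]],   # covers both query vectors
]
best_multi = multi_vector_search([[1.0, 0.0], [0.0, 1.0]], multi_docs)
```

The single-vector search needs one inner product per document, while the multi-vector score needs one per pair of query vector and document vector, which is the extra cost a reduction to single-vector search aims to avoid.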

artificial intelligence, name change, proceedings, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.39)

Norm-Ranging LSH for Maximum Inner Product Search

Xiao Yan, Jinfeng Li, Xinyan Dai, Hongzhi Chen, James Cheng

Neural Information Processing Systems | Feb-14-2026, 07:08:09 GMT

MIPS is a challenging problem because modern datasets often have high dimensionality and large cardinality. Initially, tree-based methods [Ram and Gray, 2012, Koenigstein et al., 2012] were proposed for MIPS; they use a branch-and-bound strategy similar to that of the k-d tree [Friedman and Tukey, 1974].
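The branch-and-bound idea can be sketched in a simplified, one-level form. This is an illustration under stated assumptions, not any cited method: instead of a tree, points are pre-split into flat groups, each summarized by a centroid $c$ and covering radius $r$. By the Cauchy-Schwarz inequality, every point $p$ in such a ball satisfies $q \cdot p \le q \cdot c + \lVert q \rVert \, r$, so a group whose bound cannot beat the best score found so far is skipped. The names `make_ball` and `mips_with_pruning` are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def make_ball(points):
    """Summarize a group of points by its centroid and covering radius."""
    dim = len(points[0])
    center = tuple(sum(p[j] for p in points) / len(points) for j in range(dim))
    radius = max(math.dist(p, center) for p in points)
    return center, radius

def mips_with_pruning(query, groups):
    """Return (best_score, best_point, groups_scanned)."""
    q_norm = math.sqrt(dot(query, query))
    balls = [make_ball(g) for g in groups]
    # Upper bound on q.p for any p inside a ball: q.c + ||q|| * r
    bounds = [dot(query, c) + q_norm * r for c, r in balls]
    # Visit the most promising groups first.
    order = sorted(range(len(groups)), key=lambda i: -bounds[i])
    best_score, best_point, scanned = -math.inf, None, 0
    for i in order:
        if bounds[i] <= best_score:
            break  # no remaining group can beat the current best
        scanned += 1
        for p in groups[i]:
            s = dot(query, p)
            if s > best_score:
                best_score, best_point = s, p
    return best_score, best_point, scanned

# Toy data: three well-separated groups of 2-D points.
groups = [
    [(1.0, 1.0), (1.2, 0.9)],
    [(-3.0, -3.0), (-2.5, -3.5)],
    [(0.1, 4.0), (0.0, 4.2)],
]
query = (1.0, 0.0)
best_score, best_point, scanned = mips_with_pruning(query, groups)
```

A tree-based method applies the same bound recursively at each node, so whole subtrees are pruned rather than flat groups.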

artificial intelligence, information retrieval, natural language, (19 more...)

Neural Information Processing Systems

Country:

Asia > China > Hong Kong (0.04)
North America > Canada > Quebec > Montreal (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.48)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.46)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.34)

Fast Similarity Search via Optimal Sparse Lifting

Wenye Li, Jingwei Mao, Yin Zhang, Shuguang Cui

Neural Information Processing Systems | Feb-14-2026, 01:26:31 GMT

algorithm, dimension, vector, (16 more...)

Neural Information Processing Systems

Country:

Asia > China > Guangdong Province > Shenzhen (0.05)
North America > Canada > Quebec > Montreal (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.97)
Information Technology > Artificial Intelligence > Natural Language (0.69)

Random Projections with Asymmetric Quantization

Xiaoyun Li, Ping Li

Neural Information Processing Systems | Feb-13-2026, 08:20:28 GMT

estimator, projection, random projection, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Afghanistan > Parwan Province > Charikar (0.04)
North America > United States > Texas > Dallas County > Dallas (0.04)
(14 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

EMBERSim: A Large-Scale Databank for Boosting Similarity Search in Malware Analysis

Neural Information Processing Systems | Feb-11-2026, 22:57:28 GMT

Moreover, we observe that the focus in the few related works falls on quantifying similarity in malware, often overlooking the clean data. This one-sided quantification is especially dangerous in the context of detection bypass.

data mining, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
(4 more...)

5248e5118c84beea359b6ea385393661-Paper.pdf

Neural Information Processing Systems | Feb-8-2026, 16:36:50 GMT

algorithm, neighbor, query, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Harris County > Houston (0.04)
Europe > Poland (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Industry: Information Technology (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.99)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.94)

EMBERSim: A Large-Scale Databank for Boosting Similarity Search in Malware Analysis

Neural Information Processing Systems | Dec-25-2025, 07:44:51 GMT

In recent years there has been a shift from heuristics-based malware detection towards machine learning, which proves to be more robust in the current heavily adversarial threat landscape. While we acknowledge machine learning to be better equipped to mine for patterns in the increasingly large volumes of similar-looking files, we also note a remarkable scarcity of data available for similarity-targeted research. Moreover, we observe that the focus in the few related works falls on quantifying similarity in malware, often overlooking the clean data. This one-sided quantification is especially dangerous in the context of detection bypass. We propose to address the deficiencies in the space of similarity research on binary files, starting from EMBER, one of the largest malware classification datasets. We enhance EMBER with similarity information as well as malware class tags, to enable further research in the similarity space. Our contribution is threefold: (1) we publish EMBERSim, an augmented version of EMBER that includes similarity-informed tags; (2) we enrich EMBERSim with automatically determined malware class tags using the open-source tool AVClass on VirusTotal data; and (3) we describe and share the implementation for our class scoring technique and leaf similarity method.

embersim, large-scale databank, similarity search, (4 more...)

Neural Information Processing Systems

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.63)

Norm-Ranging LSH for Maximum Inner Product Search

Xiao Yan, Jinfeng Li, Xinyan Dai, Hongzhi Chen, James Cheng

Neural Information Processing Systems | Nov-20-2025, 19:31:59 GMT

artificial intelligence, information retrieval, natural language, (19 more...)

Neural Information Processing Systems

Country:

Asia > China > Hong Kong (0.04)
North America > Canada > Quebec > Montreal (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.48)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.46)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.34)

Fast Similarity Search via Optimal Sparse Lifting

Wenye Li, Jingwei Mao, Yin Zhang, Shuguang Cui

Neural Information Processing Systems | Nov-20-2025, 19:06:05 GMT

Similarity search is a fundamental problem in computing science with various applications and has attracted significant research attention, especially in large-scale search with high dimensions.

artificial intelligence, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country: